Method for in vitro molecular evolution of protein function

ABSTRACT

The present invention relates to a method for in vitro creation of molecular libraries and evolution of protein function. Particularly, it relates to variability and modification of protein function by shuffling polynucleotide sequence segments. A protein of desired characteristics can be obtained by incorporating variant peptide regions (variant motifs) into defined peptide regions (scaffold sequence). The variant motifs can be obtained from parent DNA which has been subjected to mutagenesis to create a plurality of differently mutated derivatives thereof, or they can be obtained from in vivo sequences. These variant motifs can then be incorporated into a scaffold sequence and the resulting coded protein screened for desired characteristics. This method is ideally used for obtaining antibodies with desired characteristics by isolating individual CDR DNA sequences and incorporating them into a scaffold which may, for example, be from a totally different antibody.

FIELD OF THE INVENTION

The present invention relates to a method for in vitro molecular evolution of protein function. Particularly, but not exclusively, it relates to the shuffling of polynucleotide sequence segments within a coding sequence.

BACKGROUND OF THE INVENTION

Protein function can be modified and improved in vitro by a variety of methods, including site directed mutagenesis (Moore et al, 1987), combinatorial cloning (Huse et al, 1989; Marks et al, 1992) and random mutagenesis combined with appropriate selection systems (Barbas et al, 1992).

The method of random mutagenesis together with selection has been used in a number of cases to improve protein function and two different strategies exist. Firstly, the entire gene sequence can be randomized in combination with the selection of a variant (mutant) protein with the desired characteristics, followed by a new round of random mutagenesis and selection. This method can then be repeated until a protein variant is found which is considered optimal (Moore et al, 1996). Here, the traditional route to introduce mutations is by error prone PCR (Leung et al, 1989) with a mutation rate of ≈0.7%.

Secondly, defined regions of the gene can be mutagenized with degenerate primers, which allows for mutation rates up to 100% (Griffiths et al, 1994; Yang et al, 1995). The higher the mutation rate used, the more limited the region of the gene that can be subjected to mutations.

Random mutation has been used extensively in the field of antibody engineering. In vivo formed antibody genes can be cloned in vitro (Larrick et al, 1989) and random combinations of the genes encoding the variable heavy and light genes can be subjected to selection (Marks et al, 1992). Functional antibody fragments selected can be further improved using random mutagenesis and additional rounds of selections (Hoogenboom et al, 1992).

The strategy of random mutagenesis is followed by selection. Variants with interesting characteristics can be selected and the mutagenized DNA regions from different variants, each with interesting characteristics, are combined into one coding sequence (Yang et al, 1995). This is a multi-step sequential process, and potential synergistic effects of different mutations in different regions can be lost, since they are not subjected to selection in combination. Thus, these two strategies do not include simultaneous mutagenesis of defined regions and selection of a combination of these regions. Another process involves combinatorial pairing of genes which can be used to improve e.g. antibody affinity (Marks et al, 1992). Here, the three CDR-regions in each variable gene are fixed and this technology does not allow for shuffling of individual CDR regions between clones.

Selection of functional proteins from molecular libraries has been revolutionized by the development of the phage display technology (Parmley et al, 1987; McCafferty et al, 1990; Barbas et al, 1991). Here, the phenotype (protein) is directly linked to its corresponding genotype (DNA) and this allows for direct cloning of the genetic material, which can then be subjected to further modifications in order to improve protein function. Phage display has been used to clone functional binders from a variety of molecular libraries with up to 10¹¹ transformants in size (Griffiths et al, 1994). Thus, phage display can be used to directly clone functional binders from molecular libraries, and can also be used to improve further the clones originally selected.

Random combination of DNA from different mutated clones is a more efficient way to search through sequence space. The concept of DNA shuffling (Stemmer, 1994) utilises random fragmentation of DNA and assembly of fragments into a functional coding sequence. In this process it is possible to introduce chemically synthesised DNA sequences and in this way target variation to defined places in a gene whose DNA sequence is known (Crameri et al, 1995). In theory, it is also possible to shuffle DNA between any clones. However, if the resulting shuffled gene is to be functional with respect to expression and activity, the clones to be shuffled have to be related or even identical with the exception of a low level of random mutations. DNA shuffling between genetically different clones will generally produce non-functional genes.

SUMMARY OF THE INVENTION

At its most general the present invention provides a method of obtaining a polynucleotide sequence encoding a protein of desired characteristics comprising the steps of incorporating at least one variant nucleotide region (variant motif) into defined nucleotide regions (scaffold sequence) derived from a parent polynucleotide sequence. The new assembled polynucleotide sequence may then be expressed and the resulting protein screened to determine its characteristics.

The present method allows protein characteristics to be altered by modifying the polynucleotide sequence encoding the protein in a specific manner. This may be achieved by either a) replacing a specified region of the nucleotide sequence with a different nucleotide sequence or b) mutating the specified region so as to alter the nucleotide sequence. These specified regions (variant motifs) are incorporated within scaffold or framework regions (scaffold sequence) of the original polynucleotide sequence (parent polynucleotide sequence) which when reassembled will encode a protein of altered characteristics. The characteristics of the encoded protein are altered as a result of the amino acid sequence being changed corresponding to the changes in the coding polynucleotide sequence.

Rather than modifying a sequence at random and then relying on extensive screening for the desired coded protein, the present inventors have found it desirable to provide a method which modifies selected segments (variant motifs) of a protein while maintaining others.

The variant motifs may be segments of nucleotide sequence that encode specified regions of a protein, for example functional regions of a protein (e.g. loops) or CDR regions in an antibody.

The scaffold sequence may be segments of nucleotide sequence which it is desirable to maintain, for example they may encode more structural regions of the protein, e.g. framework regions in an antibody.

The variant motifs may be nucleotide segments which originated from the same polynucleotide sequence as the scaffold sequence, i.e. the parent polynucleotide sequence, but which have been mutated so as to alter the coding sequence from that in the parent. For example, the parent polynucleotide sequence may encode an antibody. The nucleotide sequences encoding the CDR regions of the antibody (variant motifs) may be selected from the remaining coding sequence of the parent polynucleotide, mutated and then reassembled with scaffold sequence derived from the remaining coding sequence. The expressed antibody will differ from the wild type antibody expressed by the parent polynucleotide in the CDR regions only.

Alternatively, the variant motif may be derived from a polynucleotide sequence encoding a protein sequentially related to the protein encoded by the parent polynucleotide sequence. For example, the CDR regions from one antibody (antibody A) may be replaced by the CDR regions of another antibody (antibody B).

In each case the resulting expressed protein can be screened for desired characteristics. Desirable characteristics may be changes in the biological properties of the protein. For example, the tertiary structure of the protein may be altered. This may affect its binding properties, its ability to be secreted from cells or into cells or, for enzymes, its catalytic properties. If the protein is an antibody or part thereof it may be desirable to alter its ability to specifically bind to an antigen or to improve its binding properties in comparison to the parent antibody.

According to one aspect of the present invention, there is provided a method of obtaining a protein of desired characteristics by incorporating variant peptide regions (variant motifs) into defined peptide regions (scaffold sequence), which method comprises the steps of:

(a) subjecting parent polynucleotide sequence encoding one or more protein motifs to mutagenesis to create a plurality of differently mutated derivatives thereof, or obtaining parent polynucleotide encoding a plurality of variant protein motifs of unknown sequence,

(b) providing a plurality of pairs of oligonucleotides, each pair representing spaced-apart locations on the parent polynucleotide sequence bounding an intervening variant protein motif, and using each said pair of oligonucleotides as amplification primers to amplify the intervening motif;

(c) obtaining single-stranded nucleotide sequence from the thus-isolated amplified nucleotide sequence; and

(d) assembling nucleotide sequence encoding a protein by incorporating nucleotide sequences derived from step (c) above with nucleotide sequence encoding scaffold sequence.

The method may further comprise the step of expressing the resulting protein encoded by the assembled nucleotide sequence and screening for desired properties.

Preferably the parent polynucleotide sequence is DNA from which is derived DNA sequences encoding the variant motifs and scaffold sequences.

Preferably the pairs of oligonucleotides are single-stranded oligonucleotide primers. One of said pair may be linked to a member of a specific binding pair (MSBP). The MSBP is preferably biotin, whose specific binding partner could for example be streptavidin. By using the specific binding pair the amplified nucleotide sequences may be isolated.

Random mutation can be accomplished by any conventional method; but a suitable method is error-prone PCR.

The protein in question could, for example, be an antibody or antibody fragment having desirable characteristics. Examples of antibody fragments, capable of binding an antigen or other binding partner, are the Fab fragment consisting of the VL, VH, CL and CH1 domains; the Fd fragment consisting of the VH and CH1 domains; the Fv fragment consisting of the VL and VH domains of a single arm of an antibody; the dAb fragment which consists of a VH domain; isolated CDR regions and F(ab′)2 fragments, a bivalent fragment including two Fab fragments linked by a disulphide bridge at the hinge region. Single chain Fv fragments are also included.

In one approach, after randomly mutating DNA encoding the antibody, or a portion of that DNA (eg that which encodes the Fab regions or variable regions), oligonucleotide primers could be synthesised corresponding to sequences bounding the CDRs (the variant motifs), so that DNA encoding the CDRs is amplified, along with any mutations that may have occurred in the CDRs. These can be incorporated in the reassembly of the antibody coding sequence, using the amplified CDR DNA sequences and the unmutated scaffold framework (FR) DNA sequences, resulting in the expression of an antibody which has a novel combination of CDRs, and potentially having altered properties which can be selected or screened for in conventional manner.

In another approach, rather than mutating CDRs and reassembling them back into an antibody which will be closely related to the parent antibody from which the CDRs were derived, the CDRs may be taken from one or more existing antibodies, but be of unknown sequence. Using oligonucleotide primers representing sequences bounding the various CDRs, the individual CDRs can be amplified, isolated and assembled into a predetermined scaffold.

Of course, combinations of the foregoing approaches could be used, with CDRs taken from one or more parent antibodies, and assembled into a scaffold to produce a completely new, secondary antibody. Then, after screening to obtain a secondary antibody with desired characteristics, the DNA encoding it could be mutated, the CDRs amplified and isolated, and then reassembled with unmutated non-CDR (scaffold) DNA from the secondary antibody, to produce variants of the secondary antibody which are mutated in the CDRs, and which can be screened for improved properties with respect to the originally selected secondary antibody.

The present invention allows a novel way for the isolation of DNA sequences from genetically related clones that are functionally different. Genetically related clones are those that belong to a particular structural class, for example immunoglobulins or alpha-beta-barrels. The invention allows for both isolation and random combination into a given DNA sequence of functional sequences from these related clones. These functional sequences may be loops that perform binding or catalysis.

The concept of the invention is demonstrated using antibody molecules where CDR-regions from different germline sequences can be isolated and randomly combined into a defined framework sequence. The invention expands the complexity of the molecular libraries that can be selected using phage display. The concept of the invention is also demonstrated by the affinity maturation of antibody fragments by the isolation and random combination of mutated CDR-regions.

It is not possible to use the DNA shuffling concept (Stemmer, 1994) to isolate specific sequences and randomly combine these into a given gene sequence, as it is not possible to amplify individual DNA regions formed in vivo using DNA shuffling. Combination of entire gene sequences is possible, but here defined regions cannot be shuffled. Rather all the DNA is shuffled. Thus, DNA sequences from genetically related clones that are functionally different, eg proteins that belong to structural classes like immunoglobulins or alpha-beta-barrels, cannot be shuffled in such a way that specific regions are kept constant and other regions are shuffled.

The system provided by the present invention offers a simple way to randomly combine functional regions of proteins (eg loops) to a defined (specifically selected) scaffold, ie shuffling of loops to a given protein tertiary structure in order to find new protein functions. Furthermore, the DNA shuffling technology introduces mutations at a rate of 0.7% (Stemmer, 1994). Thus, the known DNA shuffling technology (Stemmer, 1994) does not allow for shuffling of unmutated regions, since the process itself introduces mutations at random positions, including the scaffold regions.

In contrast, the invention allows for mutagenesis of defined DNA-sequences together with shuffling and assembly of these pieces of DNA into a coding region, and will allow for mutagenesis of defined regions and subsequent selection of these regions in combination.

The invention allows for different regions of DNA from different sequences (clones) to be shuffled and randomly combined. This increases the genetic variation from which functional antibody fragments are selected and will thus increase the probability of selecting proteins with the desired characteristics. It can be realised that by randomly shuffling as few as a hundred CDRs at each position in the VH and VL of an antibody fragment, as many as 10¹² combinations may be obtained, thereby extending the variability normally found in the immune system.
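The 10¹² figure follows from simple counting if one assumes six CDR positions in total (three in the VH and three in the VL) and independent choice at each position. The short Python sketch below only illustrates that arithmetic; the function name and the assumption of exactly six positions are illustrative and are not part of the described method.

```python
def combinatorial_library_size(variants_per_position: int, positions: int) -> int:
    """Number of distinct combinations when each position is chosen independently."""
    return variants_per_position ** positions


# Assumption for illustration: three CDRs in the VH plus three in the VL.
CDR_POSITIONS = 6

# One hundred different CDRs available at every position.
library_size = combinatorial_library_size(100, CDR_POSITIONS)
print(f"theoretical combinations: {library_size:.0e}")
```

The count is a theoretical upper bound on sequence combinations; the number of clones actually obtained in a library depends on transformation efficiency.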

The invention provides amplification of defined regions from eg a cDNA library using two primers of which one is biotinylated. Using the MSBP, e.g. biotin, group, single stranded DNA can be isolated and used in the gene assembly process. The present inventors have demonstrated this with the amplification of diverse CDR regions from an antibody gene library and the combination of these CDR regions randomly to a given framework region. Thus, defined regions of DNA (framework regions) can be interspaced by random regions of DNA (CDR regions), which have an in vivo origin or can be chemically synthesized.

The present invention also provides polynucleotide sequences and the proteins they encode produced by the method described above. There are also provided vectors incorporating the polynucleotide sequences and host cells transformed by the vectors.

The present invention also provides a polynucleotide library comprising polynucleotides created by the method described above which may be used for phage display.

Aspects and embodiments of the present invention will now be illustrated, by way of example, with reference to the accompanying figures. Further aspects and embodiments will be apparent to those skilled in the art. All documents mentioned in this text are incorporated herein by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows shuffling of specific DNA sequences between different clones, based on the assembly of gene sequences from a set of overlapping oligonucleotides following a one-step PCR protocol.

FIG. 2 shows different dissociation rate constants for different CDR-shuffled clones. A low bar represents a slow dissociation-rate, a high bar represents a fast dissociation-rate. Clone 36 is the original non-mutated antibody fragment.

FIG. 3 shows the results of affinity purified scFv antibody fragment assayed on HPLC, Superose S-200 FPLC-column (Pharmacia) in PBS buffer. Peak 1 is the monomeric form of the antibody fragment, peak 2 is a small amount of impurity and peak 3 is NaN₃ (sodium azide), used as a preservative.

FIG. 4 shows a schematic representation of amplification of defined sequences of DNA and the shuffling of these into a master framework. Only the CDR regions are amplified.

FIG. 4A: Assembly of genes for the VH-domain. The template is scFv-B11 mutated with error prone PCR. An individual CDR is amplified using two primers adjacent to the particular CDR and one of these primers is biotinylated at the 5′ end. The individual CDR is amplified and double-stranded DNA (dsDNA) is produced with the mutations focused to the CDR since the two amplification primers do not contain any mutations. This DNA is separated into two single stranded DNA molecules. The molecule without biotin is used in gene assembly. Primers 725, 729, 730, 728, 727 are synthesized in a DNA synthesizer and primers H2, H3, H5 contain mutated CDR and are amplified as above.

FIG. 4B: Assembly of genes for the VL-domain. CDRs are amplified in the same way as in A. Primers 759, 738, 745, 744, 880 are synthesized in a DNA synthesizer and primers L2, L3, L5 contain mutated CDR and are amplified as above.

FIG. 5 shows the alignment of the peptide sequences for clones 3, 11 and 31 with the original non-mutated antibody fragment (wt). The CDR-regions are marked. Mutations in clones 3, 11 and 31 are underlined.

FIG. 6 shows the principles for the isolation of single-stranded DNA for the shuffling of defined DNA regions.

FIG. 7 shows the length of CDR3 heavy chain from different clones. These CDR regions have been amplified from different germline sequences and randomly cloned to a defined framework region (from DP-47 sequence).

FIG. 8 shows a schematic representation of amplification of defined sequences of DNA and the shuffling of these into a master framework. All the oligonucleotides used in the gene assembly are amplified by PCR, but only the CDR regions contain any genetic variation.

FIG. 8A: Assembly of genes for the VH-domain. The template for the framework region amplification is scFv-B11, whereas CDRs are amplified from cDNA prepared from peripheral blood lymphocytes, tonsils and spleen. An individual DNA fragment is amplified using two primers located at the ends of the fragments to be amplified and one of these primers is biotinylated at the 5′ end. The individual DNA fragment is amplified and double-stranded DNA (dsDNA) is produced. This DNA is separated into two single stranded DNA molecules. The molecule without biotin is used in gene assembly, i.e. primers H1, H4, H6 and these primers contain no variation. Primers HCDR1, HCDR2, HCDR3 contain different CDR and are amplified using two primers adjacent to the particular CDR and one of these primers is biotinylated at the 5′ end. The individual CDR is amplified and double-stranded DNA (dsDNA) is produced with the variation focused to the CDR since the two amplification primers do not contain any mutations. This DNA is separated into two single stranded DNA molecules and used in gene assembly of VH domain in a library format, i.e. the variation in the CDRs is derived from different germ-line sequences. Primers BT25 and BT26 are synthesized in a DNA-synthesizing machine.

FIG. 8B: Assembly of genes for the VL-domain. In principle the same procedure as in A. Primers L1, L4, L6 are amplified and produced by PCR and contain no variation. LCDR1, LCDR2, LCDR3 contain different CDR. Primers BT7 and BT12 are synthesized in a DNA-synthesizing machine.

FIG. 9 shows the variation in a library constructed according to FIG. 8. The scFv region of library clones and original scFv-B11, binding to FITC (fluorescein-iso-thiocyanate) was synthesized by PCR. Purified PCR products were cut with BstNI and separated on a 2.5% agarose gel. Clones 1-15 are in lane 2-16, clones 16-29 are in lane 18-31. Original scFv-B11 is in lane 32. Analysis revealed that 28 clones could be sorted in 13 different groups according to restriction pattern and fragment size. Eight clones (1, 2, 8, 10, 12, 16, 26, 27) were unique, 2 clones (17, 24) appeared similar, 1 group of clones (18, 23, 29) had 3 similar members, 2 groups (5, 15, 14, 19) and (3, 4, 6, 11) had 4 members and 1 group (7, 9, 13, 20, 21, 22, 25) had 7 similar members. This experiment underestimates the variation in the library since BstNI detects only a fraction of sequence variability. In addition, the gel resolution did not allow the detection of minor size differences and did not resolve fragments below 100 bp.

FIG. 9B shows clones showing similar restriction pattern in the experiment exemplified in FIG. 9A cut by both BstNI and BamHI and separated on 3% agarose gels. To facilitate comparison, the groups of similar clones described in experiment A were put together on the gels. Clone 8 and 28 from experiment A were excluded due to space limitations.

Gel I) Lane 1-8; standard, clone 5, 15, 14, 19, 2, 27, original scFv-B11, respectively
Gel II) Lane 1-8; standard, clone 16, 17, 24, 18, 23, 29, 26, respectively
Gel III) Lane 1-8; standard, clone 7, 9, 13, 20, 21, 22, 25, respectively
Gel IV) Lane 1-8; standard, clone 3, 4, 6, 11, 1, 10, 12, respectively

Under these improved experimental conditions, essentially all clones had different restriction patterns/fragment sizes. All clones were different from the original scFv-B11 gene (lane 8, gel I). Moreover, the groups of clones which appeared similar in FIG. 9A were found to be different as analyzed in FIG. 9B. See clones 5, 15, 14, 19 (lanes 2-5, gel I), clones 17, 24 (lanes 3-4, gel II), clones 18, 23, 29 (lanes 5-7, gel II), clones 7, 9, 13, 20, 21, 22, 25 (lanes 2-8, gel III) and clones 3, 4, 6, 11 (lanes 2-5, gel IV).

In conclusion, these experiments suggest that the library contains high variability.

DETAILED DESCRIPTION AND EXEMPLIFICATION OF THE INVENTION

One aspect of the DNA shuffling procedure can be illustrated by the following steps in FIG. 1.

A: A gene coding for a protein of interest is divided into overlapping oligonucleotides.

B: The oligonucleotides are assembled using PCR into a full length gene-sequence.

C: The gene sequence is subjected to mutagenesis, eg by error-prone PCR.

D: Pairs of oligonucleotides are synthesized, each pair covering a region defined by one of the oligonucleotides in step A above, except for a region located in the middle of the step A oligonucleotide. This uncovered region is the DNA sequence that can be shuffled after PCR amplification. These two synthesised oligonucleotides can thus be used as amplification primers to amplify the uncovered region.

E: One of these amplification primers is biotinylated and the double-stranded PCR product can then be isolated using well-known streptavidin systems.

F: From the thus isolated amplified oligonucleotides can be obtained a single-stranded DNA sequence containing DNA from the uncovered region mentioned above, which can then be used as an oligonucleotide in a new assembly of the gene sequence as described in step A.

G: If DNA sequences from different clones and from different regions of the mutated gene sequence are amplified and made single-stranded, they will combine randomly in the PCR process of gene assembly. This random combination is the basis for in vitro molecular evolution.
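The random combination described in step G can be pictured with a small in silico sketch. The Python below is only a toy model of the combinatorics: it interleaves fixed scaffold segments with one randomly drawn variant motif per position. It does not model PCR, primer overlap or strand selection, and the sequences, names and pool sizes are invented for illustration.

```python
import random


def assemble_shuffled_gene(scaffold, motif_pools, rng):
    """Interleave fixed scaffold segments with one randomly drawn motif per position."""
    if len(scaffold) != len(motif_pools) + 1:
        raise ValueError("need exactly one more scaffold segment than motif pools")
    parts = []
    for segment, pool in zip(scaffold, motif_pools):
        parts.append(segment)           # scaffold segment is never varied
        parts.append(rng.choice(pool))  # variant motif drawn at random from its pool
    parts.append(scaffold[-1])
    return "".join(parts)


# Hypothetical toy sequences: four scaffold segments flanking three variable positions.
scaffold = ["ATGGCC", "GGTACC", "TTCGAA", "TAA"]
motif_pools = [["AAA", "AAC"], ["CCC", "CCG"], ["GGG", "GGT"]]

rng = random.Random(0)  # seeded so the sketch is repeatable
library = {assemble_shuffled_gene(scaffold, motif_pools, rng) for _ in range(200)}
print(len(library), "distinct assembled sequences")
```

With two variants at each of three positions, at most 2 × 2 × 2 = 8 distinct sequences can arise, and every one of them carries the unchanged scaffold.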

EXAMPLES

The present inventors have demonstrated the concept of shuffling of defined DNA in different experimental settings. Firstly, the shuffling of in vitro mutated CDR regions in an antibody fragment for affinity maturation purposes (examples 1 and 2) is exemplified and secondly the shuffling of in vivo formed CDRs for creation of a highly variable antibody library (examples 3 and 4) is exemplified.

1. Affinity Maturation

A model system was developed, based on the scFv-B11 antibody fragment which binds to FITC. The full-length gene encoding this scFv was assembled from a set of 12 oligonucleotides (FIG. 4A and FIG. 4B) representing the known DNA sequence of the scFv-B11, and the functional binding of the gene product to FITC could be verified. This gene sequence was then mutagenized using error-prone PCR, and the DNA encoding the CDR regions was amplified as described above, using the amplification primers, one of which is biotinylated. (The CDR regions are the parts of the antibody molecule involved in binding the antigen, in this case FITC).

All six CDR regions were amplified and a new gene was assembled using six oligonucleotides selected from the first assembly of 12 oligonucleotides (see above) (these were not mutagenized) and six from the amplification of mutagenized CDR regions. Selection of functional antibody fragments that bound FITC was carried out using phage display. 50% of the clones bound FITC with different dissociation-rates than did the original scFv-B11, as measured in the BIAcore biosensor (FIG. 2). This demonstrates that the clones were changed in the way they recognized FITC.

Of the 16 clones identified to bind FITC in BIAcore (FIG. 2), clones 3, 11, 27 and 31 were chosen to be analyzed in more detail as these clones exhibited the larger changes in off-rates. These clones were expressed and affinity-purified on a column conjugated with FITC-BSA and eluted with a low pH buffer. The purified scFv-antibody fragments were further purified and analyzed with HPLC, using a Pharmacia Superdex 200 FPLC column with the capacity to separate the monomeric and dimeric form of the antibodies. In all clones the monomeric form dominated (typical size profile is shown in FIG. 3). This was then purified and used in detailed analysis of affinity using a BIAcore biosensor (Table 1).

TABLE 1
Affinity determination of selected clones.

Clone                 k_(ASS) (M⁻¹ s⁻¹)   k_(DISS) (s⁻¹)   K_(A) (M⁻¹)
#3                    2.0 × 10⁵           4.3 × 10⁻³       4.8 × 10⁷
#11                   2.6 × 10⁵           3.3 × 10⁻³       7.8 × 10⁷
#27                   5.0 × 10⁵           16.0 × 10⁻³      3.1 × 10⁷
#31                   1.2 × 10⁵           5.4 × 10⁻³       2.1 × 10⁷
(FITC-B11 original)   2.7 × 10⁵           9.7 × 10⁻³       2.8 × 10⁷

Clone #11 exhibited an affinity 2.8 times higher than the original scFv-B11 antibody fragment. This increase is based on a slower off-rate. One clone (#27) showed a 2 times increase in association-rate. However, the overall affinity of this clone was similar to the original FITC-B11 clone due to a faster dissociation-rate. The distribution of different association and dissociation-rates among the clones was considered a source for CDR-reshuffling for further improvement of affinities.
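The relationship between the rate constants and the affinity constant in Table 1 can be checked with a few lines of Python. The sketch assumes a simple 1:1 binding model in which K_(A) = k_(ASS) / k_(DISS); that model is an assumption made here for illustration, although the tabulated values are approximately consistent with it. The function and variable names are illustrative only.

```python
def affinity_constant(k_ass: float, k_diss: float) -> float:
    """Equilibrium association constant K_A (M^-1), assuming simple 1:1 binding."""
    return k_ass / k_diss


# (k_ass in M^-1 s^-1, k_diss in s^-1), copied from Table 1.
rate_constants = {
    "#3": (2.0e5, 4.3e-3),
    "#11": (2.6e5, 3.3e-3),
    "#27": (5.0e5, 16.0e-3),
    "#31": (1.2e5, 5.4e-3),
    "original": (2.7e5, 9.7e-3),
}

affinities = {clone: affinity_constant(*rates) for clone, rates in rate_constants.items()}

# Fold change in affinity relative to the original scFv-B11.
fold_change = {clone: ka / affinities["original"] for clone, ka in affinities.items()}
for clone, fold in fold_change.items():
    print(f"{clone}: K_A = {affinities[clone]:.2e} M^-1, fold = {fold:.2f}")
```

Under this assumption clone #11 comes out about 2.8 times higher than the original, while clone #27 stays close to the original because its faster association is offset by faster dissociation.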

Three clones were sequenced. In the VH region (ie half of the scFv-B11 and carrying three CDR regions) the mutations found were all in the CDR regions as expected, since these were the only regions mutagenized and amplified using the amplification primers. Interestingly, all the CDR regions were different and carried different mutations (FIG. 5). However, in the case of CDR region 2, the same mutation was found (a tyrosine to histidine substitution) in all 3 clones (the rest of the CDR regions differed between the clones).

Furthermore, the mutation rates were found to be in between 2% and 4%, as determined from the base changes in the 90 bp long sequence built up from three CDR regions together. This is more than the error-prone PCR mutation rate, and indicates that there is combination of individual CDR regions from different clones.
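A mutation rate of this kind is simply the number of base changes divided by the length compared. The Python sketch below shows the calculation on an invented 90 bp sequence with three substitutions; the sequences are placeholders and not the actual CDR sequences, and the 0.7% figure used for comparison is the error-prone PCR rate quoted in the background section.

```python
def mutation_rate(parent: str, variant: str) -> float:
    """Fraction of positions that differ between two equal-length sequences."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have the same length")
    changes = sum(a != b for a, b in zip(parent, variant))
    return changes / len(parent)


# Hypothetical 90 bp parent sequence (placeholder, not a real CDR sequence).
parent = "ACGT" * 22 + "AC"

# Introduce three substitutions at arbitrary positions.
bases = list(parent)
bases[0] = "T"   # A -> T
bases[30] = "A"  # G -> A
bases[60] = "C"  # A -> C
variant = "".join(bases)

rate = mutation_rate(parent, variant)
expected_changes_at_pcr_rate = 0.007 * len(parent)
print(f"observed rate: {rate:.1%}; changes expected at 0.7%: {expected_changes_at_pcr_rate:.2f}")
```

Three changes in 90 bp correspond to about 3.3%, which falls inside the 2% to 4% range reported, whereas a 0.7% rate would give fewer than one change on average over the same length.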

2. Affinity Maturation-Reshuffling

In order to perform a second shuffling (reshuffling), clones selected for their binding affinity to FITC were used in an additional round of CDR-amplification and library construction. In theory, the reshuffled library will contain mutated shuffled CDR-regions, selected for improved binding to FITC. In this way, new combinations of CDR-regions, improved with respect to binding, could be constructed and the library subjected to selection for binders with improved affinities.

The pool of all clones obtained from the selection procedure (as detailed in example 1) was used as template for CDR amplifications. One amplification was carried out for each CDR using primers listed in Table 2.

TABLE 2
Sequences for primers used in CDR-shuffling.

CDR Reamplification Primers
764    5′ B-GTC CCT GAG ACT CTC CTG TGC AGC CTC TGG ATT CAC CTT T 3′
875    5′ TCC CTG GAG CCT GGC GGA CCC A 3′
876    5′ CGC CAG GCT CCA GGG AAG GGG CTG GAG TGG GTC TCA 3′
765    5′ B-GGA ATT GTC TCT GGA GAT GGT GAA 3′
799    5′ GAG CCG AGG ACA CGG CCG TGT ATT ACT GTG CAA GA 3′
766    5′ B-GCG CTG CTC ACG GTG ACC AGG GTA CCT TGG CCC CA 3′
767    5′ B-AGC GTC TGG GAC CCC CGG GCA GAG GGT CAC CAT CTC TTG T 3′
800    5′ GGG CCG TTC CTG GGA GCT GCT GGT ACC A 3′
801    5′ GCT CCC AGG AAC GGC CCC CAA ACT CCT CAT CTA T 3′
768    5′ B-GAC TTG GAG CCA GAG AAT CGG TCA GGG ACC CC 3′
802    5′ CTC CGG TCC GAG GAT GAG GCT GAT TAT TAC TGT 3′
769    5′ B-CGT CAG CTT GGT TCC TCC GCC GAA 3′

Framework VH
727    5′ CCG CCG GAT CCA CCT CCG CCT GAA CCG CCT CCA CCG CTG CTC ACG GTG ACC A 3′
728    5′ GAC CGA TGG ACC TTT GGT ACC GGC GCT GCT CAC GGT GAC CA 3′
729    5′ GAG GTG CAG CTG TTG GAG TCT GGG GGA GGC TTG GTA CAG CCT GGG GGG TCC CTG AGA CTC TCC TGT 3′
730    5′ GGC CGT GTC CTC GGC TCT GAG GCT GTT CAT TTG CAG ATA CAG CGT GTT CTT GGA ATT GTC TCT GGA GAT GGT 3′

Framework VL
738    5′ CAG TCT GTG CTG ACT CAG CCA CCC TCA GCG TCT GGG ACC CCC G 3′
744    5′ ACT AGT TGG ACT AGC CAC AGT CCG TGG TTG ACC TAG GAC CGT CAG CTT GGT TCC TCC GC 3′
745    5′ CTC ATC CTC GGA CCG GAG CCC ACT GAT GGC CAG GGA GCC TGA GGT GCC AGA CTT GGA GCC AGA GAA TCG 3′
1129   5′ CAG GCG GAG GTG GAT CCG GCG GTG GCG GAT CGC AGT CTG TGC TGA CTC AGC CAC CCT CAG CGT CTG GGA CCC CCG 3′

Amplification primers VH/VL Assembly
1125   5′ ACT CGC GGC CCA ACC GGC CAT GGC CGA GGT GCA GCT GTT GGA G 3′
1126   5′ CAA CTT TCT TGT CGA CTT TAT CAT CAT CAT CTT TAT AAT CAC CTA GGA CCG TCA GCT TGG T 3′

B = Biotin labeled 5′ primer

The amplification was performed according to the following parameters: 100 ng template (1.6×10⁸ CFU bacteria grown for 6 h), 60 pmol each primer, 5 Units PFU polymerase (Stratagene), 1×PFU buffer, 500 μM dNTPs, reaction volume 100 μl; preheat at 96° C. for 10 minutes; 25 cycles of 96° C. for 1 minute, 68° C. for 1 minute and 72° C. for 1 minute; then 72° C. for 10 minutes. This procedure was essentially the same as for CDR amplification in Example 1. The amplified CDRs were used for assembly into VH and VL encoding sequences according to FIGS. 1, 4A, 4B and Table 3.

TABLE 3 PCR parameters for the assembly of VH and VL gene sequences in CDR-shuffling

VL               VH               Amount
Primer 759       Primer 725       30 pmol
Primer 738       Primer 729       0.6 pmol
Primer L2        Primer H2        0.6 pmol
Primer L3        Primer H3        0.6 pmol
Primer 745       Primer 730       0.6 pmol
Primer L5        Primer H5        0.6 pmol
Primer 744       Primer 728       0.6 pmol
Primer 880       Primer 727       30 pmol
Taq              Taq              10 Units
dNTPs            dNTPs            200 μM
1x Taq buffer    1x Taq buffer    to 100 μl

Preheat 95° C. for 10 minutes; 20 cycles of 95° C. for 1 minute, 68° C. for 1 minute and 72° C. for 1 minute; 72° C. for 10 minutes.
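The amounts in Table 3 can be converted to molar concentrations to make the excess of the two 30 pmol primers over the 0.6 pmol components explicit. The following is a minimal Python sketch of that arithmetic; the function name `pmol_to_nanomolar` and the labels "outer" and "inner" are hypothetical and describe only the position of the entries in the table.

```python
REACTION_VOLUME_UL = 100.0  # final volume stated in Table 3 ("to 100 μl")


def pmol_to_nanomolar(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol, in a volume in μl, to a concentration in nM.

    1 pmol/μl equals 1 μmol/l (1 μM), so multiply by 1000 to obtain nM.
    """
    return amount_pmol / volume_ul * 1000.0


# First and last entries of each column in Table 3 (30 pmol each).
outer_nM = pmol_to_nanomolar(30.0, REACTION_VOLUME_UL)
# All other oligonucleotide entries in Table 3 (0.6 pmol each).
inner_nM = pmol_to_nanomolar(0.6, REACTION_VOLUME_UL)

excess = outer_nM / inner_nM

print(f"30 pmol components: {outer_nM:.0f} nM")
print(f"0.6 pmol components: {inner_nM:.0f} nM")
print(f"molar excess: {excess:.0f}-fold")
```

Under these figures the 30 pmol primers are present at 300 nM and the remaining components at 6 nM, a 50-fold molar excess.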

The VH and VL were then assembled into a scFv encoding sequence according to standard procedures (Griffiths et al 1994). The resulting library was subjected to panning so as to select binders with improved affinities to FITC. The selection procedure for the reshuffled library was essentially the same as for the initially shuffled library. The total number of clones obtained after selection was 510. Six clones (B) were chosen from this new pool and were tested and compared to 6 clones (A) from the first pool, originating from the shuffled library (Table 4).

TABLE 4 Dissociation-rates of individual clones selected from the shuffled library (clones A) and from the reshuffled library (clones B).

Clone                  K_(DISS) (s⁻¹ × 10⁻³)
scFv-B11 (original)    12.9
1A                     6.3
12A                    5.7
13A                    9.0
14A                    9.7
16A                    1.8
17A                    7.9
22B                    0.2
31B                    0.3
32B                    9.8
33B                    6.8
34B                    7.3
35B                    8.7

Two clones from the reshuffling experiments (22B and 31B) exhibited substantially slower dissociation-rates, indicating that the reshuffling process yielded binders with improved affinities.
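To put the values of Table 4 in proportion, the following minimal Python sketch computes, for each clone, the fold reduction in dissociation rate relative to the original scFv-B11 and a dissociation half-life. It assumes that the tabulated values are in units of 10⁻³ s⁻¹ and that dissociation is a simple first-order process (half-life = ln 2 / rate); both are assumptions of this sketch, and the variable names are hypothetical.

```python
import math

# Dissociation rates from Table 4, assumed to be in units of 10^-3 s^-1.
K_DISS = {
    "scFv-B11 (original)": 12.9,
    "1A": 6.3, "12A": 5.7, "13A": 9.0, "14A": 9.7, "16A": 1.8, "17A": 7.9,
    "22B": 0.2, "31B": 0.3, "32B": 9.8, "33B": 6.8, "34B": 7.3, "35B": 8.7,
}

reference = K_DISS["scFv-B11 (original)"]

# Fold reduction in dissociation rate relative to the original clone.
fold_slower = {clone: reference / k for clone, k in K_DISS.items()}

# Half-life in seconds, assuming first-order dissociation.
half_life_s = {clone: math.log(2) / (k * 1e-3) for clone, k in K_DISS.items()}

# Print clones from slowest to fastest dissociation.
for clone in sorted(K_DISS, key=K_DISS.get):
    print(f"{clone:20s} {fold_slower[clone]:6.1f}-fold  {half_life_s[clone]:8.0f} s")
```

On these assumptions, clones 22B and 31B dissociate roughly 64-fold and 43-fold more slowly than scFv-B11, whereas the best clone from the first pool (16A) is about 7-fold slower.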

3. Cloning and Shuffling of Defined DNA Regions

In our system it is possible to amplify defined regions from a cDNA library using two primers of which one is biotinylated. Using the biotin group, single stranded DNA can be isolated and used in the gene assembly process (FIG. 6). We have demonstrated this with the amplification of diverse CDR regions from an antibody gene library and the random combination of these CDR regions into a given framework region. Thus, defined regions of DNA (framework regions) can be interspaced by random regions of DNA (CDR regions) which have an in vivo origin (Table 5). The CDR3 regions vary in size (FIG. 7). Alternatively, these regions could be chemically synthesised.

TABLE 5 Combination of CDR regions from different germline sequences transplanted to the DP-47 framework encoding the variable heavy domain. For CDR1 and CDR2 the suggested germline origin is indicated. For CDR3 the number of residues in the CDR region is given. N.D. = not determined.

Clone    CDR1     CDR2     CDR3
1        DP-35    DP-42    12
2        DP-49    DP-53    13
3        N.D.     DP-51    11
4        DP-32    DP-47    10
5        DP-41    DP-47    8
6        DP-32    DP-77    9
7        DP-31    DP-47    7
8        DP-49    DP-35    5
9        DP-49    DP-35    N.D.
10       DP-48    DP-48    N.D.
11       DP-51    DP-47    10
12       DP-34    DP-31    N.D.
13       DP-85    DP-53    4
14       DP-31    DP-77    10
15       DP-34    DP-53    4
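The extent of mixing shown in Table 5 can be tallied directly. The following minimal Python sketch counts the distinct suggested germline origins for CDR1 and CDR2, the clones whose CDR1 and CDR2 derive from different germlines, and the range of CDR3 lengths. Entries listed as N.D. are represented as `None`; the variable names are hypothetical.

```python
# Rows of Table 5: (clone, CDR1 germline, CDR2 germline, CDR3 length).
# None stands for "N.D." (not determined).
TABLE_5 = [
    (1, "DP-35", "DP-42", 12), (2, "DP-49", "DP-53", 13),
    (3, None, "DP-51", 11), (4, "DP-32", "DP-47", 10),
    (5, "DP-41", "DP-47", 8), (6, "DP-32", "DP-77", 9),
    (7, "DP-31", "DP-47", 7), (8, "DP-49", "DP-35", 5),
    (9, "DP-49", "DP-35", None), (10, "DP-48", "DP-48", None),
    (11, "DP-51", "DP-47", 10), (12, "DP-34", "DP-31", None),
    (13, "DP-85", "DP-53", 4), (14, "DP-31", "DP-77", 10),
    (15, "DP-34", "DP-53", 4),
]

# Distinct suggested germline origins per CDR position.
cdr1_origins = {c1 for _, c1, _, _ in TABLE_5 if c1 is not None}
cdr2_origins = {c2 for _, _, c2, _ in TABLE_5 if c2 is not None}

# Clones for which both origins were determined, and those where they differ.
both_known = [(c1, c2) for _, c1, c2, _ in TABLE_5 if c1 and c2]
mixed = [pair for pair in both_known if pair[0] != pair[1]]

# CDR3 lengths that were determined.
cdr3_lengths = [n for _, _, _, n in TABLE_5 if n is not None]

print(len(cdr1_origins), "distinct CDR1 origins")
print(len(cdr2_origins), "distinct CDR2 origins")
print(len(mixed), "of", len(both_known), "clones combine different germlines")
print("CDR3 length range:", min(cdr3_lengths), "to", max(cdr3_lengths))
```

For the 15 sequenced clones this gives 9 distinct CDR1 origins, 8 distinct CDR2 origins, 13 of 14 fully assigned clones combining CDR1 and CDR2 from different germlines, and CDR3 lengths from 4 to 13 residues.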

4. Library Construction

A gene library was constructed encoding scFv antibody fragments. The strategy used for this library is based on the assembly of a set of oligonucleotides into a sequence encoding VH and VL antibody domains (FIGS. 8A, 8B). Native in vivo formed CDR regions can be shuffled and assembled into a given master framework. In this example we have developed this concept further and assembled both VH and VL encoding gene sequences with native CDR regions into a given master framework. Thus, all six CDR positions have been shuffled. The template origin for CDR amplification was cDNA from peripheral blood B-cells, spleen, tonsils and lymph nodes. Oligonucleotides encoding the framework regions have also been amplified using the strategy with two flanking primers, where one is biotinylated (primers L1, H1, L4, H4, L6, H6). The primers used are described in Table 6 and in FIGS. 8A, 8B.

TABLE 6 Sequences for primers used in library construction.

Amplification of framework fragments
BT1.    5′ ACA GTC ATA ATG AAA TAC CTA TTG C 3′
BT2.    5′ B-GC ACA GGA GAG TCT CA 3′
BT3.    5′ B-CA CCA TCT CCA GAG ACA ATT CC 3′
BT4.    5′ GGC CGT GTC CTC GGC TCT 3′
BT5.    5′ B-TG GTC ACC GTG AGC AGC 3′
BT6.    5′ CCG CCG GAT CCA CCT 3′
BT7.    5′ CAG GCG GAG GTG GAT CCG GC 3′
BT8.    5′ B-CG GGG GTC CCA GAC GCT 3′
BT9.    5′ B-CG ATT CTC TGG CTC CAA GT 3′
BT10.   5′ CTC ATC CTC GGA CCG GA 3′
BT11.   5′ B-TC GGC GGA GGA ACC AAG CT 3′
BT12.   5′ TGG CCT TGA TAT TCA CAA ACG AAT 3′

Amplification of in vivo CDR
BT13.   5′ B-TC CCT GAG ACT CTC CTG TGC AGC CTC TGG ATT CAC CTT 3′
BT14.   5′ TTC CCT GGA GCC TGG CGG ACC CA 3′
BT15.   5′ B-GG AAT TGT CTC TGG AGA TGG TGA A 3′
BT16.   5′ GTC CGC CAG GCT CCA 3′
BT17.   5′ B-CG CTG CTC ACG GTG ACC AGT GTA CCT TGG CCC CA 3′
BT18.   5′ AGA GCC GAG GAC ACG GCC GTG TAT TAC TGT 3′
BT19.   5′ B-AG CGT CTG GGA CCC CCG GGC AGA GGG TCA CCA TCT CTT 3′
BT20.   5′ GGG CCG TTC CTG GGA GCT GCT GAT ACC A 3′
BT21.   5′ GCT CCC AGG AAC GGC CCC CAA ACT CCT CAT CTA T 3′
BT22.   5′ B-GA CTT GGA GCC AGA GAA TCG GTC AGG GAC CCC 3′
BT23.   5′ B-GT CAG CTT GGT TCC TCC GCC GAA 3′
BT24.   5′ CTC CGG TCC GAG GAT GAG GCT GAT TAT TAC T 3′

Assembly of VH and VL
BT25.   5′ B-TA CCT ATT GCC TAC GGC AGC CGC TGG ATT GTT ATT ACT CGC GGC CCA GCC GGC CAT GGC CGA 3′
BT26.   5′ CCG CCG GAT CCA CCT CCG CCT GAA CCG CCT CCA CCG CTG CTC ACG GTG ACC A 3′

Amplification primers 2nd assembly
BT27.   5′ B-TGG CCT TGA TAT TCA CAA ACG AAT 3′
BT28.   5′ B-ACG GCA GCC GCT GGA TTG 3′

B = Biotin labeled 5′ primer

The PCR parameters for CDR and framework region amplification were essentially the same as described in Example 2. The PCR parameters for assembly of genes encoding VH and VL are described in Table 7.

TABLE 7 PCR parameters for the assembly of VH and VL gene sequences for library construction.

VH               VL               Amount
Primer BT25      Primer BT7       30 pmol
Primer H1        Primer L1        0.6 pmol
Primer HCDR1     Primer LCDR1     0.6 pmol
Primer HCDR2     Primer LCDR2     0.6 pmol
Primer H4        Primer L4        0.6 pmol
Primer HCDR3     Primer LCDR3     0.6 pmol
Primer H6        Primer L6        0.6 pmol
Primer BT26      Primer BT12      30 pmol
Taq              Taq              10 Units
dNTPs            dNTPs            200 μM
1x Taq buffer    1x Taq buffer    to 100 μl

Preheat 95° C. for 10 minutes; 20 cycles of 95° C. for 1 minute, 68° C. for 1 minute and 72° C. for 1 minute; 72° C. for 10 minutes.

The assembled VH and VL gene sequences were assembled into a scFv coding sequence using standard protocols (Griffiths et al 1994). A library of 1.1×10⁹ members was constructed. Of the 40 clones tested, all 40 contained an insert of the right size, as determined by PCR and agarose gel electrophoresis. In order to test the variability in the library, PCR amplified and purified inserts were subjected to cleavage by BstNI and BamHI. Clones showed different restriction patterns, as determined by agarose gel electrophoresis and compared to the control scFv-B11 (FIG. 9).

In order to estimate the frequency of clones able to express scFv antibody fragments, clones from the library containing the FLAG sequence (Hopp et al, 1989), as well as control bacteria with and without FLAG sequence, were plated at low density on Luria broth plates containing 100 μg/ml ampicillin, 25 μg/ml tetracycline and 1% glucose. The plates were grown at 37° C. overnight and lifted to nitrocellulose filters by standard methods (Sambrook et al 1989). In order to induce expression of the scFv genes in the bacteria, filters were incubated for 4 hrs on plates containing 0.5 mM isopropyl-thio-β-D-galactoside (IPTG) but without glucose. Bacteria were then lysed by lysozyme/chloroform treatment, the filters were washed and incubated with anti-FLAG M2 antibody (Kodak) followed by anti-mouse peroxidase conjugated second antibody (P260 Dakopatts) and detected by DAB (3,3′-diaminobenzidine tetrahydrochloride, Sigma) (Table 8).

TABLE 8 Frequency of intact antibody genes in the library

Library Pool                            Tested clones    FLAG positive clones    Percent positive clones
A                                       145              88                      60
B                                       77               52                      67
C                                       158              105                     66
D                                       68               48                      70
All library pools                       448              293                     65.4
Positive control pFAB5cHis scFvB11      64               64                      100
Negative control pFAB5cHis              30               0                       0

The anti-FLAG antibody detects a FLAG sequence situated downstream of the scFv gene in the library constructs as well as in the control vector pFAB5cHis scFvB11, but not in the original vector pFAB5cHis. Clones to which the anti-FLAG antibody binds therefore contain an intact open reading frame of the scFv gene.
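The figures in Table 8 can be recomputed from the raw counts. The following minimal Python sketch does so and, purely as an illustration, scales the stated library size of 1.1×10⁹ members by the overall FLAG-positive fraction. That extrapolation assumes the 448 sampled clones are representative of the whole library; it is an assumption of this sketch and not a result reported in the description. Variable names are hypothetical.

```python
# Library pools from Table 8: pool -> (tested clones, FLAG positive clones).
POOLS = {"A": (145, 88), "B": (77, 52), "C": (158, 105), "D": (68, 48)}

tested_total = sum(tested for tested, _ in POOLS.values())
positive_total = sum(positive for _, positive in POOLS.values())

# Percent FLAG-positive clones per pool and over all pools.
percent_positive = {
    pool: 100.0 * positive / tested for pool, (tested, positive) in POOLS.items()
}
overall_percent = 100.0 * positive_total / tested_total

# Illustrative extrapolation only: assumes the sample represents the library.
LIBRARY_SIZE = 1.1e9  # library members, as stated in the description
estimated_intact = LIBRARY_SIZE * positive_total / tested_total

for pool, pct in percent_positive.items():
    print(f"pool {pool}: {pct:.1f}% positive")
print(f"all pools: {positive_total}/{tested_total} = {overall_percent:.1f}%")
print(f"estimated clones with intact reading frame: {estimated_intact:.2e}")
```

The recomputed per-pool percentages agree with Table 8 when rounded down to whole numbers, and the overall figure reproduces the tabulated 65.4%. Under the stated assumption, this would correspond to roughly 7×10⁸ library members carrying an intact scFv open reading frame.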

REFERENCES

-   Barbas, C F et al: Proc Natl Acad Sci USA, 88:7978-82 (1991)
-   Barbas, C F et al: Proc Natl Acad Sci USA, 89:4457-61 (1992)
-   Crameri, A et al: Biotechniques, 18:194-196 (1995)
-   Griffiths, A D et al: EMBO J, 13:3245-3260 (1994)
-   Hoogenboom, H R et al: J Mol Biol, 227:381-8 (1992)
-   Hopp, T P et al: Biotechniques, 7:580-589 (1989)
-   Huse, W D et al: Science, 246:1275-81 (1989)
-   Larrick, J W et al: Biochem Biophys Res Commun, 160:1250-6 (1989)
-   Leung, D W et al: Technique, 1:11-15 (1989)
-   Marks, J D et al: Biotechnology, 10:779-83 (1992)
-   McCafferty, J et al: Nature, 348:552-4 (1990)
-   Moore, J C et al: Nature Biotechnology, 14:458-467 (1996)
-   Parmley, S F et al: Gene, 73:305-318 (1988)
-   Roberts, S et al: Nature, 328:731-4 (1987)
-   Sambrook, J et al: Molecular cloning. A laboratory manual. Cold Spring Harbor Laboratory Press 1989.
-   Stemmer, W P: Nature, 370:389-391 (1994)
-   Yang, W P et al: J Mol Biol, 254:392-403 (1995)

1-55. (canceled)
56. A method of generating an assembled polynucleotide sequence encoding a protein of desired characteristics comprising the steps of:
a) providing at least one polynucleotide sequence comprising one or more variant polynucleotide sequences encoding one or more in vivo formed variant protein motifs;
b) providing one or more pairs of defined oligonucleotides, each pair representing spaced apart locations on the at least one polynucleotide sequence of step a), each pair binding adjacent to a variant polynucleotide sequence encoding an in vivo formed variant protein motif;
c) using the pairs of defined oligonucleotides as amplification primers for PCR to amplify the variant polynucleotide sequences encoding the in vivo formed variant protein motifs of the at least one polynucleotide sequence of step a) and performing PCR amplification on the at least one polynucleotide sequences of step a);
d) obtaining one or more single-stranded polynucleotide sequences from the amplified polynucleotide sequences in step c);
e) providing one or more unmutated, specifically selected, scaffold polynucleotide sequences encoding one or more mutated peptide regions; and
f) annealing said one or more single-stranded polynucleotide sequences from step d) with the unmutated, specifically selected, scaffold polynucleotide sequences from step e) such that annealed polynucleotides with one or more gaps is formed, and filling the one or more gaps present in the annealed polynucleotides, thereby generating one or more assembled polynucleotide sequences.